An Integrated Data Mining System to Automate Discovery of Measures of Association

Authors

  • Cecil Eng Huang Chua
  • Roger H. L. Chiang
  • Ee-Peng Lim
Abstract

Many data analysts require tools that integrate their database management packages (e.g. Microsoft Access) with their data analysis packages (e.g. SAS, SPSS) and that provide guidance for selecting appropriate mining algorithms. In addition, analysts need to extract and validate statistical results to facilitate data mining. In this paper, we describe an integrated data mining system called the Linear Correlation Discovery System (LCDS) that meets these requirements. LCDS consists of four major subcomponents, two of which, the selection assistant and the statistics coupler, are discussed in this paper. The former examines the schema and instances to determine appropriate association measurement functions (e.g. chi-square, linear regression, ANOVA). The latter invokes the appropriate statistical function on a sample data set and extracts relevant statistical output, such as η and R, for effective mining of data. We also describe a new validation algorithm based on measuring the consistency of mining results applied to multiple test sets.
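
The role of the selection assistant can be illustrated with a minimal sketch. It is not the LCDS implementation: the names `Kind`, `infer_kind`, and `select_measure` are hypothetical, and the mapping from attribute types to measures follows the usual textbook convention (two categorical attributes: chi-square; two numeric attributes: linear regression; one of each: ANOVA), which the abstract does not spell out.

```python
from enum import Enum


class Kind(Enum):
    """Coarse attribute kinds (an assumption; LCDS may use a richer schema)."""
    CATEGORICAL = "categorical"
    NUMERIC = "numeric"


def infer_kind(values):
    """Guess an attribute's kind from its instances (hypothetical heuristic)."""
    non_null = [v for v in values if v is not None]
    # bool is a subclass of int in Python, so exclude it explicitly.
    all_numbers = all(
        isinstance(v, (int, float)) and not isinstance(v, bool)
        for v in non_null
    )
    if non_null and all_numbers:
        return Kind.NUMERIC
    return Kind.CATEGORICAL


def select_measure(x_values, y_values):
    """Pick an association measure for a pair of attributes.

    Textbook mapping, assumed here for illustration only:
      categorical vs categorical -> chi-square
      numeric vs numeric         -> linear regression
      categorical vs numeric     -> ANOVA
    """
    kinds = {infer_kind(x_values), infer_kind(y_values)}
    if kinds == {Kind.CATEGORICAL}:
        return "chi-square"
    if kinds == {Kind.NUMERIC}:
        return "linear regression"
    return "ANOVA"


if __name__ == "__main__":
    region = ["north", "south", "north", None]
    segment = ["retail", "retail", "wholesale", "retail"]
    revenue = [120.5, 98.0, 143.2, 101.7]
    units = [12, 9, 15, 10]
    print(select_measure(region, segment))
    print(select_measure(revenue, units))
    print(select_measure(region, revenue))
```

A component in the role of the statistics coupler would then pass the chosen pair of attributes to the matching routine in the external statistics package and read back the relevant output.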

Similar resources

Employing data mining to explore association rules in drug addicts

Drug addiction is a major social, economic, and health challenge that affects the whole community and poses a serious threat. Available treatments succeed only in the short term unless the underlying reasons that make individuals prone to addiction are investigated. Nowadays, some treatment centers hold comprehensive information about addicted people. Therefore, given the ...

Automatic Discovery of Technology Networks for Industrial-Scale R&D IT Projects via Data Mining

Industrial-scale R&D IT projects depend on many sub-technologies that must be understood, and whose risks must be analysed, before a project can begin if it is to succeed. When planning such a project, the list of technologies and their associations with each other is often complex and forms a network. Discovery of this network of technologies is time consumi...

Applying a decision support system for accident analysis by using data mining approach: A case study on one of the Iranian manufactures

Uncertain and stochastic states have always been taken into consideration in risk management and accident analysis, as in other fields of industrial engineering, and they complicate managers' decisions when selecting corrective actions and control measures. In this research, huge data sets of the accidents of a manufacturing and industrial unit have been st...

A data mining approach to employee turnover prediction (case study: Arak automotive parts manufacturing)

Training and adaptation of employees consume time and money. Employee turnover can be predicted from organizational and personal historical data in order to reduce an organization's probable losses. Prediction methods are closely tied to human resource management and obtain patterns from historical data. This article implements knowledge discovery steps on real data of a manufacturing pla...

Expert Discovery: A web mining approach

Expert discovery is the search for an answer to the question: “Who is the best expert on a specific subject in a particular domain within a particular set of parameters?” Experts with domain knowledge in any field are crucial for consulting in industry, academia, and the scientific community. The aim of this study is to address the issues of the expert-finding task in real-world communities. Collabor...

Publication date: 2000